split BQ merge statements and run them individually #1038

Merged: 4 commits, Jan 11, 2024

heavycrystal
Contributor

For tables with many columns, and especially many TOAST columns, some batches can produce a MERGE statement so large or complex that BigQuery cannot process it, failing with errors like:

The query is too large. The maximum standard SQL query length is 1024.00K characters, including comments and white space characters.
Error 400: Resources exceeded during query execution: The query is too complex., resourcesExceeded

For now, the fix is to split these complex MERGE statements into smaller ones, each acting on a different subset of the raw table (partitioned by _peerdb_unchanged_toast_columns). This can lead to a table needing 10+ MERGE statements in a single batch, but that is a compromise we accept with the current design. Instead of sending the MERGEs for all tables at once, we now run them per table and update the metadata at the end, which avoids exceeding the SQL query length limit.
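A rough sketch of the idea, not the code in this PR: the distinct _peerdb_unchanged_toast_columns values are grouped, one statement is built per group, and the statements are run one at a time with the metadata update last. The names `buildMergeStatements`, `runMerges`, `exec`, and `updateMetadata` are hypothetical, and the MERGE text is a simplified placeholder rather than the real generated SQL.

```go
package main

import (
	"fmt"
	"strings"
)

// buildMergeStatements returns one simplified MERGE per group of at most
// perStmt distinct _peerdb_unchanged_toast_columns values.
// Hypothetical helper: the real statement also carries the key match and
// the WHEN MATCHED / WHEN NOT MATCHED clauses, which are elided here.
func buildMergeStatements(dstTable, rawTable string, toastValues []string, perStmt int) []string {
	if perStmt <= 0 {
		perStmt = 1
	}
	var stmts []string
	for start := 0; start < len(toastValues); start += perStmt {
		end := start + perStmt
		if end > len(toastValues) {
			end = len(toastValues)
		}
		quoted := make([]string, 0, end-start)
		for _, v := range toastValues[start:end] {
			// Naive quoting for illustration only; real code should
			// prefer query parameters.
			quoted = append(quoted, "'"+strings.ReplaceAll(v, "'", "\\'")+"'")
		}
		stmts = append(stmts, fmt.Sprintf(
			"MERGE %s dst USING (SELECT * FROM %s WHERE _peerdb_unchanged_toast_columns IN (%s)) src ON /* key match and WHEN clauses elided */",
			dstTable, rawTable, strings.Join(quoted, ", ")))
	}
	return stmts
}

// runMerges executes every statement individually, table by table, and
// updates the metadata only after all of them have succeeded.
func runMerges(stmtsPerTable [][]string, exec func(stmt string) error, updateMetadata func() error) error {
	for _, stmts := range stmtsPerTable {
		for _, stmt := range stmts {
			if err := exec(stmt); err != nil {
				return fmt.Errorf("merge failed: %w", err)
			}
		}
	}
	return updateMetadata()
}
```

Smaller groups keep each statement short and simple at the cost of more round trips to BigQuery, which is the trade-off described above.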

flow/connectors/utils/array.go
@@ -16,3 +16,15 @@ func ArrayMinus(first []string, second []string) []string {
}
return result
}

func ArrayChunksGen[T any](slice []T, size int) [][]T {
Suggested change:
- func ArrayChunksGen[T any](slice []T, size int) [][]T {
+ func ArrayChunks[T any](slice []T, size int) [][]T {
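
The function body is not visible in this part of the diff. For readers unfamiliar with the pattern, here is a minimal sketch of what a generic chunking helper with this signature could look like. The body is an assumption, not the merged implementation; only the name and signature come from the suggestion above.

```go
package main

// ArrayChunks splits slice into consecutive chunks of at most size elements.
// The last chunk may be shorter. A non-positive size yields nil.
// Sketch only: the actual body in flow/connectors/utils/array.go may differ.
func ArrayChunks[T any](slice []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	chunks := make([][]T, 0, (len(slice)+size-1)/size)
	for start := 0; start < len(slice); start += size {
		end := start + size
		if end > len(slice) {
			end = len(slice)
		}
		// Each chunk shares its backing array with the input slice.
		chunks = append(chunks, slice[start:end])
	}
	return chunks
}
```

Because the chunks alias the input, callers that append to or modify a chunk should copy it first.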

@serprex serprex self-requested a review January 9, 2024 17:46
@iskakaushik iskakaushik merged commit b5fb6d0 into main Jan 11, 2024
7 checks passed
@serprex serprex deleted the bq-multi-merge-without-tx branch July 19, 2024 15:22